A study of regularized Gaussian classifier in high-dimension small sample set case based on MDL principle with application to spectrum recognition

Authors

  • Ping Guo
  • Yunde Jia
  • Michael R. Lyu
Abstract

When high-dimensional patterns such as stellar spectra are classified with a Gaussian classifier, the covariance matrix estimated from a small sample set becomes unstable, which degrades classification accuracy. In this paper, we investigate covariance matrix estimation in the high-dimensional, small-sample setting based on the minimum description length (MDL) principle. A new covariance matrix estimator is developed, and a formula for fast estimation of the regularization parameters is derived. Experiments on spectrum pattern recognition are conducted to evaluate the classification accuracy obtained with the developed covariance matrix estimator. The new approach achieves higher classification accuracy. © 2008 Elsevier Ltd. All rights reserved.
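The instability described in the abstract can be illustrated with a minimal sketch. It is not the estimator developed in the paper: it assumes a generic regularizer that adds a scaled identity, h * I, to the maximum-likelihood covariance, and the value h = 0.5, the synthetic data, the equal class priors, and all function names are hypothetical choices made for illustration only.

```python
import numpy as np

def fit_regularized_gaussian(X, h):
    """Return the class mean and a regularized covariance S_ml + h * I.

    h is a hand-picked (hypothetical) regularization parameter; the paper
    derives its own estimate, which is not reproduced here.
    """
    mean = X.mean(axis=0)
    centered = X - mean
    cov_ml = centered.T @ centered / X.shape[0]  # maximum-likelihood estimate
    return mean, cov_ml + h * np.eye(X.shape[1])

def log_discriminant(x, mean, cov):
    """Gaussian log-discriminant, up to a constant, assuming equal priors."""
    _, logdet = np.linalg.slogdet(cov)
    diff = x - mean
    maha = diff @ np.linalg.solve(cov, diff)  # squared Mahalanobis distance
    return -0.5 * (logdet + maha)

def classify(x, params):
    """Return the index of the class with the largest log-discriminant."""
    scores = [log_discriminant(x, m, c) for m, c in params]
    return int(np.argmax(scores))

# Synthetic small-sample, high-dimensional data: 10 samples in 50 dimensions.
rng = np.random.default_rng(0)
d, n = 50, 10
X0 = rng.normal(0.0, 1.0, size=(n, d))
X1 = rng.normal(3.0, 1.0, size=(n, d))

# With n < d the unregularized estimate is rank deficient, hence singular.
cov_ml = np.cov(X0, rowvar=False, bias=True)
rank_ml = np.linalg.matrix_rank(cov_ml)

# Adding h * I makes every eigenvalue at least h, so the matrix is invertible.
params = [fit_regularized_gaussian(X0, h=0.5),
          fit_regularized_gaussian(X1, h=0.5)]
min_eig = np.linalg.eigvalsh(params[0][1]).min()
```

In this sketch the unregularized covariance cannot be inverted because its rank is below the dimension, while the regularized version supports the usual quadratic discriminant. How h should be chosen is exactly the question the paper addresses with the MDL principle.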


Similar resources

Multi-regularization Parameters Estimation for Gaussian Mixture Classifier based on MDL Principle

Regularization addresses the unstable estimation of the covariance matrix from a small sample set in a Gaussian classifier. Estimating multiple regularization parameters is more difficult than estimating a single parameter. In this paper, KLIM_L covariance matrix estimation is derived theoretically based on the MDL (minimum description length) principle for the small sample problem...

Regularized D-LDA for face recognition

Linear Discriminant Analysis (LDA) is derived from the optimal Bayes classifier when classes are assumed to be Gaussian with identical covariance matrices. However, it is well known that the distribution of face images under a perceivable variation in viewpoint, illumination, or facial expression is highly nonlinear and complex. The Quadratic Discriminant Analysis (QDA) which relaxes the identi...

Comparison of spectral methods for spoken language identification

Automatic spoken language identification is the task of identifying a language from the speech signal. Language identification systems can be divided into two categories: spectral-based methods and phonetic-based methods. In the former, short-time characteristics of the speech spectrum are extracted as a multi-dimensional vector. The statistical model of these features is then obtained for each language. The ...

Improvement of Chemical Named Entity Recognition through Sentence-based Random Under-sampling and Classifier Combination

Chemical Named Entity Recognition (NER) is the basic step for subsequent information extraction tasks such as named entity resolution, drug-drug interaction discovery, and extraction of the names of molecules and their properties. Improvement in the performance of such systems may affect the quality of the subsequent tasks. Chemical text from which data for named entity recognition is extracte...

Improved nonparametric discriminant analysis for hyperspectral image classification with limited training samples

Feature extraction plays an important role in improving hyperspectral image classification. Compared with parametric methods, nonparametric feature extraction methods perform better when classes are not normally distributed. In addition, these methods can extract more features than parametric feature extraction methods can. Nonparametric feature extraction methods use nonparametric s...

Journal:
  • Pattern Recognition

Volume 41  Issue 

Pages  -

Publication date: 2008